THE EFFORT TO COUNT THE
PANDEMIC’S GLOBAL DEATH TOLL


Official data report some five million COVID-19 deaths in two years, but global excess deaths 
are estimated at double or even quadruple that figure. By David Adam 


Last year’s Day of the Dead marked a
grim milestone. On 1 November, the
global death toll from the COVID-19
pandemic passed 5 million, official
data suggested. It has now reached
5.5 million. But that figure is a significant
underestimate. Records of excess
mortality, a metric that involves
comparing all deaths recorded with those
expected to occur, show that many more people
than this have died in the pandemic.

Working out how many more is a complex
research challenge. It is not as simple as just
counting up each country’s excess mortality
figures. Some official data in this regard are
flawed, scientists have found. And more than
100 countries do not collect reliable statistics
on expected or actual deaths at all, or do not
release them in a timely manner.

Demographers, data scientists and
public-health experts are striving to narrow
the uncertainties for a global estimate of
pandemic deaths. These efforts, from both
academics and journalists, use methods
ranging from satellite images of cemeteries
to door-to-door surveys and machine-learning
computer models that try to extrapolate
global estimates from available data.

Among these models, the World Health
Organization (WHO) is still working on its
first global estimate, but the Institute for
Health Metrics and Evaluation in Seattle,
Washington, offers daily updates of its own
modelled results, as well as projections of
how quickly the global toll might rise. And
one of the highest-profile attempts to model
a global estimate has come from the news
media. The Economist magazine in London
has used a machine-learning approach to
produce an estimate of 12 million to 22 million
excess deaths, or between 2 and 4 times the
pandemic’s official toll so far (see
go.nature.com/3qjtyge and ‘Global toll’).

The uncertainty in these figures is a
discrepancy the size of the population of Sweden.
“The only fair thing to present at this
point is a very wide range,” says Sondre Ulvund
Solstad, a data scientist who leads The Economist’s
modelling work. “But as more data come
in, we are able to narrow it.”

The scramble to calculate a global death toll 
while the pandemic continues is an exercise 
that combines sophisticated statistical mod- 
elling with rapid-fire data gathering. Everyone 
involved knows any answer they provide will 
be provisional and imprecise. But they feel it 
is important to try. They want to acknowledge 
the true size and cost of the human tragedy of 
COVID-19, and they hope to counter mislead- 
ing claims prompted by official figures, such 
as China’s count of just under 5,000 COVID-19
deaths or Russia’s of just over 300,000. 


Flawed figures 


Death and taxes are famously the only certain- 
ties in life, but countries account for each of 
them in vastly different ways. Even superficially 
similar places can have varying approaches 
to recording COVID-19 deaths. Early in the 
pandemic, countries such as the Netherlands 
counted only those individuals who died in hos- 
pital after testing positive for the coronavirus 
SARS-CoV-2. Neighbouring Belgium included 
deaths in the community and everyone who 
died after showing symptoms of the disease, 
even if they weren't diagnosed. 

That is why researchers quickly turned 
to excess mortality as a proxy measure of 
the pandemic’s toll. Excess-death figures 
are seemingly easy to calculate: compare 
deaths during the pandemic with the average 
recorded over the previous five years or so. But 
even in wealthy countries with comprehensive 
and sophisticated systems to report deaths, 
excess-mortality figures can be misleading. 
That’s because the most obvious way to cal- 
culate them can fail to account for changes in 
population structure. 

“We should be careful about this issue, 
because looking at the average raw data is really
flawed,” says Giacomo De Nicola, a statistician 
at Ludwig Maximilian University of Munich, 
Germany. 

When De Nicola and colleagues worked
on a 2021 study to calculate excess mortality
caused by the pandemic in Germany, they
found that comparing deaths to average
mortality in previous years consistently
underestimated the number of expected deaths, and
so overstated excess deaths. The reason was a
rise in annual national mortality, driven in part
by a surge in the number of people aged 80
and above — a generation too young to fight 
and die in the Second World War. 

The difference for Germany is significant. 
Press-released raw data from the German sta- 
tistical office last year reported 5% more deaths 
in 2020 compared with 2019. But after taking 
the age structure into account, De Nicola’s 
group reduced this to just 1%. “Due to the lack 
of a generally accepted method for age-adjustment,
I’m pretty certain this issue extends to
many more countries,” he says. 
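
The gap between the two baselines can be made concrete with a small sketch. The Python below uses invented numbers for a hypothetical population split into two age groups; none of the figures come from De Nicola’s study or from German statistics. It compares a naive baseline (the average of the previous five years) with a simple age-adjusted baseline that applies earlier age-specific death rates to the current, older population. The function names are illustrative, and this basic adjustment is only one possible approach, not necessarily the method used in the study.

```python
from statistics import mean


def naive_expected(past_deaths):
    """Expected deaths as the plain average of previous years' totals."""
    return mean(past_deaths)


def age_adjusted_expected(rates_per_100k, current_population):
    """Expected deaths when baseline age-specific death rates
    (per 100,000 people) are applied to the current population."""
    return sum(
        rates_per_100k[group] * current_population[group] / 100_000
        for group in current_population
    )


# Hypothetical inputs, chosen only to illustrate the effect of ageing.
past_deaths = [900_000, 910_000, 920_000, 930_000, 940_000]
rates_per_100k = {"under_80": 600, "80_plus": 10_000}
current_population = {"under_80": 76_000_000, "80_plus": 5_000_000}
observed = 985_000

naive = naive_expected(past_deaths)
adjusted = age_adjusted_expected(rates_per_100k, current_population)

naive_excess = observed - naive
adjusted_excess = observed - adjusted

print(f"Naive baseline:        {naive:,.0f} expected, {naive_excess:,.0f} excess")
print(f"Age-adjusted baseline: {adjusted:,.0f} expected, {adjusted_excess:,.0f} excess")
```

Because the hypothetical population has more people aged 80 and above than in earlier years, the age-adjusted baseline expects more deaths, so the estimated excess shrinks.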

Some demographers agree. “It concerns me 
that some so-called excess-deaths estimates by 
national statistical offices just use an average 
of the past five years of deaths as the expected 
deaths. In ageing populations, this is unlikely 
to be the best estimate,” says Tom Wilson, a 
demographer at the University of Melbourne, 
Australia. Responding to De Nicola’s work, Felix 
zur Nieden, a demographer at Germany's sta- 
tistical office, says he agrees that raw numbers 
should be adjusted to take age structure and 
other subtleties into account. 


More-sophisticated analyses adjust the 
expected deaths baseline to account for 
such biases, for example by raising the num- 
ber of expected deaths as a population ages. 
Probably the most comprehensive of these 
excess-mortality estimates come from Ariel 
Karlinsky, an economist at the Hebrew Univer- 
sity of Jerusalem in Israel, and Dmitry Kobak, 
a data scientist at the University of Tübingen, 
Germany. 

Since January 2021, Karlinsky and Kobak 
have produced a regularly updated database of 
all-cause mortality before and during the pan- 
demic (2015-21) from as many sources and for 
as many places as possible, currently some
116 countries and territories. The bulk of the
information in the World Mortality Dataset (WMD),
as it is called, comes from official death statistics
collected and published by national offices and
governments. The duo then works with these
data to estimate excess mortality, including 
trying to take into account death tolls associ- 
ated with armed conflict, natural disasters and 
heatwaves. For example, they assumed that 
4,000 lives were lost in both Armenia and Azer- 
baijan during the 2020 Nagorno-Karabakh war. 

Karlinsky, who previously worked on health 
economics, recognized that even the best 
epidemiological models were based on offi- 
cial reported COVID-19 numbers that, for many 
places, were clearly too low or missing entirely. 
“Many people had been throwing around their
conjectures about excess mortality without
basing it on data,” he says.

In many cases, Karlinsky and Kobak’s esti- 
mates of excess deaths diverge significantly 
from COVID-19 mortality statistics released 
by governments. Russia, for instance, reported 
more than 300,000 COVID-19 deaths by the 
end of 2021, but is likely to have exceeded 
1 million excess deaths in that time. 

For countries covered by the WMD, official 
figures suggest that 4.1 million deaths since the 
start of the pandemic are down to COVID-19 
— around 10% of all deaths during that time. 
But the duo’s calculations suggest that, when 
excess mortality is taken into account, deaths 
related to COVID-19 are 1.6 times greater, at 
around 6.5 million deaths (or 16% of the total). 
In some countries, the relative impact of the 
virus is even higher. One-third of all deaths in 
Mexico can be attributed to the virus, Karlinsky 
and Kobak’s data suggest. 
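
The arithmetic behind these proportions can be checked in a few lines. This sketch uses only the rounded figures quoted above (4.1 million official deaths, about 10% of all deaths, and 6.5 million excess deaths). The implied total of all-cause deaths is derived from those rounded inputs, so it is approximate and is not a figure reported by the WMD.

```python
# Rounded figures quoted in the text for countries covered by the WMD.
official_covid_deaths = 4.1e6
official_share_of_all_deaths = 0.10
excess_deaths = 6.5e6

# Assumption: back out total deaths from the rounded 10% share.
implied_total_deaths = official_covid_deaths / official_share_of_all_deaths

ratio = excess_deaths / official_covid_deaths
excess_share = excess_deaths / implied_total_deaths

print(f"Excess deaths are {ratio:.1f} times the official count")
print(f"Excess deaths are {excess_share:.0%} of all deaths")
```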

Excess deaths include mortality that is not
related to COVID-19, such as deaths from other
infectious diseases, as well as indirectly related deaths,
such as a person with cancer who died because
their screening was cancelled owing to the
pandemic’s impact on health-care systems.
Some countries, such as New Zealand, even
had negative excess mortality, because they
experienced few losses to COVID-19 and saw
a drop in deaths from influenza. But Karlinsky
argues that, overall, data show that estimating
excess deaths is a reliable way to measure
COVID-19 casualties.


GLOBAL TOLL


By January 2022, there had been 5.5 million official
COVID-19 deaths worldwide in the pandemic. But
models estimate that there have been between
two and four times that number of excess deaths
(that is, mortality above what was expected)
since the start of 2020.


Modelling global deaths 


The WMD lacks excess-death estimates for 
more than 100 countries, including China, 
India and many in Africa. That’s because those 
countries either do not collect death statistics 
or do not publish them speedily. But they also 
account for millions of COVID-19 deaths. A true 
pandemic global death toll cannot be counted 
without those data, but some researchers 
argue it is possible to model one. 

Such an estimate has been produced for a 
pandemic before — for influenza. Starting in 
the Americas in March 2009, a type of H1N1
influenza A virus ravaged the world for more 
than a year. By the time the WHO declared that 
pandemic over in August 2010, the organiza- 
tion’s ‘official’ toll, made up of laboratory-con- 
firmed deaths, was less than 19,000. 

A team of international public-health experts
took a different approach. Starting with 
estimated influenza deaths in 20 countries, 
together covering more than one-third of the 
world’s population, the researchers looked for 
factors that could explain why some of these 
countries fared better or worse than others. 
They found ten indicators, including popula- 
tion density, number of doctors and income. 
The relationship between these contributing 
factors and deaths for a given country allowed 
them to model how many flu deaths they would
expect in other countries, purely on the basis of
a country’s performance on these indicators.


RICH AND POOR


Official figures suggest that wealthy countries had the highest number of deaths
per capita during the pandemic. But a model that estimates excess deaths suggests
that is false: lower middle-income countries might have been hit hardest.

Their study suggested that between 123,000 
and 203,000 people died in the pandemic in 
the last 9 months of 2009 — about 10 times the 
WHO count. In 2019, the same team repeated 
the exercise to model deaths from seasonal 
flu epidemics from 2002 to 2011, starting 
this time with data from 31 countries. They 
reported that an average of 389,000 respira- 
tory deaths (uncertainty range 294,000 to 
518,000) were associated with flu globally 
for each year modelled.
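
The extrapolation step in this kind of study can be sketched in miniature. The Python below fits a straight line relating one indicator to death rates in countries that have data, then uses that line to estimate deaths in countries that do not. It is a simplified illustration under stated assumptions: the flu study combined ten indicators, whereas this sketch uses a single hypothetical one (doctors per 1,000 people), and every number and country name is invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical countries with data:
# indicator = doctors per 1,000 people, outcome = deaths per 100,000.
doctors_per_1000 = [0.5, 1.0, 2.0, 3.0, 4.0]
deaths_per_100k = [9.0, 8.0, 6.0, 4.0, 2.0]

intercept, slope = fit_line(doctors_per_1000, deaths_per_100k)

# Hypothetical countries without mortality data:
# name -> (doctors per 1,000 people, population).
countries_without_data = {
    "Country A": (1.5, 20_000_000),
    "Country B": (0.8, 50_000_000),
}

estimates = {}
for name, (doctors, population) in countries_without_data.items():
    # Predicted rate cannot be negative, so clamp at zero.
    predicted_rate = max(0.0, intercept + slope * doctors)
    estimates[name] = predicted_rate * population / 100_000

for name, deaths in estimates.items():
    print(f"{name}: about {deaths:,.0f} estimated deaths")
```

A real analysis would also carry uncertainty through to the estimates, which is why the published figures come with wide ranges.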

The same method should work for COVID-19, 
says Cécile Viboud, an epidemiologist at the 
National Institutes of Health in Bethesda, Mar- 
yland, who worked on the 2019 influenza study. 
“We have much more data [for COVID-19] than 
we did with flu. So, in a way it is cleaner.” Unlike
with flu, it should be much easier to attribute 
respiratory deaths to the COVID-19 pandemic, 
she says, because the circulation of almost 
every other respiratory pathogen was stopped 
owing to lockdowns and other measures. 
“Statistically, it’s a much easier proposition,” 
Viboud says. 

The model used by The Economist to track 
the COVID-19 pandemic uses machine learning 
to identify more than 100 national indicators 
that seem to correlate with excess deaths in 
more than 80 countries where data are avail- 
able. These features include official deaths, 
the scale of COVID-19 testing and the results 
of antibody surveys, but also geographical
latitude, the degree of Internet censorship
and the number of years a country has been a 
democracy. It is possible to examine the impor- 
tance of each indicator in the model, but this 
is far from straightforward — features can act 
in combination, and their relative importance
might differ for countries that have different 
characteristics, says Solstad. 

Plug numbers for these indicators for a 
country that doesn’t produce mortality data 
into the model, and algorithms estimate that 
country’s excess deaths. The model estimates 
some 5 million deaths in India, for example, 
10 times higher than the country’s official 
COVID-19 toll of less than 500,000 deaths. 
That estimate is sadly plausible — using sam- 
ple surveys of households and sub-national 
mortality data, academic groups have sepa- 
rately estimated that as many as 3 million to 
5 million people might have died from COVID-19
in India. The Economist's algorithm has a
wide uncertainty interval of between 1 million 
and 7.5 million deaths for India. 

For China, the model estimates almost 
750,000 deaths (well over 150 times higher 
than the country’s reported 4,600), but with 
a wide uncertainty interval ranging from as 
low as 200,000 fewer deaths than expected, 
to as high as 1.9 million excess deaths. Some 
researchers think that although China’s report 
of only 4,600 deaths is probably an under- 
estimate, The Economist's central estimate 
overstates the real number. COVID-19 deaths 
could well have been under-reported there in 
the first few frantic months of the pandemic,
Karlinsky and other researchers say, but proba- 
bly only by a factor of two or three. Since then, 
China’s strict zero-COVID policy has probably 
stemmed the number of deaths. 

The Economist's model highlights how coun- 
tries’ official death counts often underestimate 
the true number — but that the extent of the 
underestimate varies. Excess deaths in the 
world’s richest countries might be around 
one-third above official counts, but those in 
the poorest countries could be more than 
20 times higher, although these estimates are 
extremely uncertain. 

Overall, the model suggests that lower mid- 
dle-income countries (as described by World 
Bank groupings) have suffered at least as 
severely in per-capita deaths as rich countries,
in contrast to the picture given by official
figures (see ‘Rich and poor’). That’s despite the
fact that these poorer countries have younger 
populations, adds Solstad. 


Bad practice? 


Not everyone agrees with the approach. One 
vocal critic of the magazine’s pandemic mod- 
elling is Gordon Shotwell, a data scientist in 
Halifax, Nova Scotia, who published a blog 
post that called it irresponsible (see go.nature. 
com/3jpdkrs). “Models like this have the effect 
of putting a thin veneer of objectivity and 
science-y thinking over what’s basically an 
op-ed,” he wrote. 

In September, for instance, the magazine 
used its model results to say that pandemic 
deaths in Kenya were between 19,000 and 
110,000, versus an official figure of 4,746. 

“Using any model to make an estimate about 
those places I think is just bad practice,” Shotwell
told Nature. “You don’t learn anything by
training a model on mostly rich countries with 
high life expectancy and applying it to poor 
countries with low life expectancy.” 

Solstad, not surprisingly, sees it differently: 
“I think it is better to provide an uncertain
number than to rely on a very certain number that
is clearly false.” 

Very low or zero ‘official’ numbers of 
COVID-19 deaths for countries where data are 
patchy or lacking present problems of their 
own, he says. They have fuelled nonsense 
theories that people in Africa have genetic 
resistance to the disease and don’t need inter- 
national help or vaccines, for instance. 

Some demographers see Shotwell’s point of 
view, saying that applying modelling to coun- 
tries without their own deaths data is inherently 
difficult. “The process is intrinsically flawed. 
The data are a real mess and so any modelling 
effort is going to be very speculative,” says Jon
Wakefield, a statistician at the University of 
Washington in Seattle, who leads a modelling 
project run by the WHO to estimate the pan- 
demic’s excess death toll. “It’s very frustrating 
as the data are so limited. I’m not happy with 
the assumptions we're being forced to make,
but we’re doing the best we can.” 

The project, which uses a more straightfor- 
ward statistical model than The Economist to 
fill in the gaps, was scheduled to publish its 
first results in December, but they had not 
been released by mid-January as Nature went 
to press. 

Separate estimates of real-time global deaths 
from the pandemic are also produced by the 
Institute for Health Metrics and Evaluation 
(IHME), an independent global health-research
centre at the University of Washington. The 
IHME’s modelling says between 9 million and 
18 million people have died so far; it also tries 
to forecast how this number will grow, and 
how fast. 

Although its overall global mortality figure 
agrees with other estimates, there are sig- 
nificant differences at the national level. For 
example, the IHME puts cumulative excess 
deaths at almost 71,000 for Japan, compared 
with the official 18,000 reported. Yet The 
Economist's model estimates Japan’s excess 
deaths at between 550 and 27,000 (see ‘Model 
disagreements’). 

There are other discrepancies, too. In con- 
trast to The Economist's estimate, the IHME esti- 
mates just 8,500 excess deaths in China (with 
a range of 5,400-17,000). Meanwhile, in May,
the IHME made headlines and drew criticism 
for suggesting that US excess deaths in the pan- 
demic up to that time were as high as 900,000 
people. That was some 300,000 greater than 
other estimates, such as from the US Centers 
for Disease Control and Prevention and the 
WMD. In October, the IHME quietly reduced the 
May figure to 670,000 after making changes to 
its modelling strategy, which some in the field
complain is opaque and hard to follow. 

The IHME says it will soon publish a paper 
detailing its model. It also says its initial US 
excess-death estimate was too high because it 
had not taken into account that winter deaths 
from influenza and respiratory syncytial virus 
might fall, and that it could include this informa- 
tion only once official data came in months later. 


Better estimates 


Even the best models are only as good as the 
data they rest on. Through the WHO project, 
demographers and others are searching for 
ways to improve counts and estimates of 
death tolls in countries that don’t have relia- 
ble national mortality data. Researchers have 
shown this can be estimated, for example, by 
extrapolating from smaller regions in a country,
where limited data might be available.

In a study that has not yet been peer
reviewed, Karlinsky used deaths reported in
a regional newspaper for the Argentinian province
of Córdoba to extrapolate a nationwide
excess-death estimate of 120,155 from March 
2020 to August 2021, compared with official 
COVID-19 deaths for the period of 111,383. 


MODEL DISAGREEMENTS 


Models of excess deaths sometimes sharply disagree: the IHME estimates 71,000 excess deaths for Japan,
whereas The Economist model puts them at below 10,000. (In this chart, periods of negative excess mortality 
indicate that overall deaths were below average, even as COVID-19 deaths were being recorded.) 
Another method is to survey a representa-
tive sample of households to ask them about 
deaths. “This is essentially how annual number 
of deaths are estimated in countries without 
good vital registration, like Bangladesh,” 
Karlinsky says. Such surveys are under way 
in many countries and, in some cases, have 
already shown that excess mortality is several 
times higher than official COVID-19 deaths. 
This month, for instance, a team led by epi- 
demiologist Prabhat Jha at the University of 
Toronto in Canada reported the results of a 
telephone survey of adults in India conducted 
by a private polling agency tracking the pan- 
demic. The team found that there were more 
than 3 million COVID-19 deaths in India up to 
July 2021, an estimate backed up by exam- 
ining mortality data in health facilities and 
civil-registration deaths in ten states. The 
researchers — who note that other scientists 
have come to similar conclusions — estimate 
that, as of September 2021, India’s COVID-19 
deaths were 6-7 times higher than official
statistics.

Mervat Alhaffar, a public-health researcher 
at the London School of Hygiene and Tropical 
Medicine (LSHTM), worked on a study that 
used an even more direct method to estimate 
deaths: counting graves. Using satellite images 
of 11 cemeteries in Aden province in Yemen, 
the study suggested that weekly burials 
increased by up to 230% between April and 
September 2020. It estimated that, as a result 
of the COVID-19 pandemic, excess deaths for 
the region were 2,120 during the same period.
Another LSHTM team has applied the same 
technique to count fresh graves in Mogadishu, 
Somalia, estimating that the city’s excess death 
toll between January and September 2020 was 
3,200 to 11,800 (ref. 9). 

Alhaffar says the technique is useful, but 
can’t be applied everywhere. “You need to 
engage with the locals on the ground, to under- 
stand the burial practices and make sense of 
the images,” she says. It can be hard to establish
such connections, she adds, because people in
conflict zones often fear the reaction of local 
authorities. 

And, in countries where data are scarce, 
cultural burial practices are harder to track. 
“In some places, where people might prefer 
to bury their loved ones in smaller graveyards
nearer to their houses rather than in the big
ones, analysing satellite images of cemeteries
can be much more challenging,” Alhaffar says.

Amid the search for ways to count deaths, 
Andrew Noymer, a demographer at the Univer- 
sity of California, Irvine, says the pandemic and 
the increased demand for real-time mortality 
figures highlight a demographic shortcoming 
that goes back decades: many countries simply 
don’t collect good data on births, deaths and 
other vital statistics. “Demographers have 
been part of the problem, because we have 
helped to put band-aids on this for 60 years. 
We've developed all sorts of techniques to 
estimate demographic rates in the absence 
of hard data,” he says. 

That means the true death toll of COVID-19 
might always be disputed. “We still don’t know 
how many people died in the 1918 [flu] pan- 
demic, but I always figured we would know 
pretty well how many people would die in the
next one, because we live in the modern world,” 
Noymer says. “But we don’t actually, and that’s 
kind of sad for me as a demographer.” 


David Adam is a science journalist in London. 
Clarification

This article has been clarified to describe 
researchers’ views about COVID-19 deaths 
in China. 